Mining Wikipedia Article Clusters for Geospatial Entities and Relationships
نویسندگان
چکیده
We present in this paper a method to extract geospatial entities and relationships from the unstructured text of the English language Wikipedia. Using a novel approach that applies SVMs trained from purely structural features of text strings, we extract candidate geospatial entities and relationships. Using a combination of further techniques, along with an external gazetteer, the candidate entities and relationships are disambiguated and the Wikipedia article pages are modified to include the semantic information provided by the extraction process. We successfully extracted location entities with an F-measure of 81%, and location relations with an F-
منابع مشابه
Towards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore
Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...
متن کاملHow much information is geospatially referenced? Networks and cognition
This paper presents two approaches that aim to capture the portion of geospatially referenced information in user-generated content: a network approach and a cognitive approach. The German Wikipedia is used as research corpus, which is considered a network with the articles being nodes and the links being edges. The Network Degree of Geospatial Reference (NDGR) is introduced as a measurement un...
متن کاملSubtree Mining for Relation Extraction from Wikipedia
In this study, we address the problem of extracting relations between entities fromWikipedia’s English articles. Our proposed method first anchors the appearance of entities in Wikipedia’s articles using neither Named Entity Recognizer (NER) nor coreference resolution tool. It then classifies the relationships between entity pairs using SVM with features extracted from the web structure and sub...
متن کاملA clustering approach for mineral potential mapping: A deposit-scale porphyry copper exploration targeting
This work describes a knowledge-guided clustering approach for mineral potential mapping (MPM), by which the optimum number of clusters is derived form a knowledge-driven methodology through a concentration-area (C-A) multifractal analysis. To implement the proposed approach, a case study at the North Narbaghi region in the Saveh, Markazi province of Iran, was investigated to discover porphyry ...
متن کاملMining and Explaining Relationships in Wikipedia
Mining and explaining relationships between concepts are challenging tasks in the field of knowledge search. We propose a new approach for the tasks using disjoint paths formed by links in Wikipedia. Disjoint paths are easy to understand and do not contain redundant information. To achieve this approach, we propose a naive method, as well as a generalized flow based method, and a technique for ...
متن کامل